feat: consolidate image and bundle caches into LocalDB#790
Adds PersistentCacheImageChecker that caches verified image URIs to ~/.flyte/cache/images/. After the first successful remote manifest check (~4s), subsequent runs read from disk cache (~0ms), eliminating repeated network round-trips to the container registry.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Kevin Su <pingsutw@apache.org>
logger.debug(f"Image {image_uri} in registry")
return image_uri
# Persist to disk so future process invocations skip network checks
if checker is not PersistentCacheImageChecker:
One problem is that the persistent cache never invalidates. If an image is deleted from the registry, the cache will still say it exists, and the build will be skipped.
But it makes the UX much better.
We can solve this by just having a very short TTL on the cache. This is why I am suggesting SQLite. Anyway, the data is tiny and one row is enough?
A short TTL sounds good to me. I'll update it to use SQLite.
Done. Added a TTL and a cache for the code bundle.
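For readers following the thread, the short-TTL idea could look roughly like the sketch below. It is not the code from this PR: the column layout, the helper names (`cache_key`, `open_cache`, `put`, `get`), and the 300-second TTL are all assumptions made for illustration. Only the `image_cache` table name and the SHA-256 key over repository, tag, and architecture come from the PR description.

```python
import hashlib
import sqlite3
import time
from typing import Optional

# Hypothetical value; the thread only asks for a "very short" TTL.
TTL_SECONDS = 300.0


def cache_key(repository: str, tag: str, arch: str) -> str:
    """Build a SHA-256 key over repository, tag, and architecture."""
    return hashlib.sha256(f"{repository}:{tag}:{arch}".encode()).hexdigest()


def open_cache(path: str = ":memory:") -> sqlite3.Connection:
    """Open the cache database and create the table if needed (assumed schema)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS image_cache ("
        "key TEXT PRIMARY KEY, image_uri TEXT NOT NULL, created_at REAL NOT NULL)"
    )
    return conn


def put(conn: sqlite3.Connection, key: str, image_uri: str, now: Optional[float] = None) -> None:
    """Record that image_uri was verified at time `now`."""
    now = time.time() if now is None else now
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO image_cache VALUES (?, ?, ?)", (key, image_uri, now)
        )


def get(conn: sqlite3.Connection, key: str, now: Optional[float] = None) -> Optional[str]:
    """Return the cached URI, or None if the entry is missing or older than the TTL."""
    now = time.time() if now is None else now
    row = conn.execute(
        "SELECT image_uri, created_at FROM image_cache WHERE key = ?", (key,)
    ).fetchone()
    if row is None:
        return None
    image_uri, created_at = row
    if now - created_at > TTL_SECONDS:
        # Expired: drop the row so the caller falls back to the registry check.
        with conn:
            conn.execute("DELETE FROM image_cache WHERE key = ?", (key,))
        return None
    return image_uri
```

With an expiry like this, an image deleted from the registry is trusted for at most one TTL window before the network check runs again.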
@pingsutw this is great. I was thinking of adding this, and even the code bundle hash, to the local cache. Shall we just use the SQLite table to store this?
@pingsutw I think we should just use one SQLite DB. Can you please use local persistence for this? That way it will be easier to cache this, clear it, etc.
Signed-off-by: Kevin Su <pingsutw@apache.org>
"""Look up a previously uploaded bundle by its file digest. Returns (hash_digest, remote_path) or None."""
from flyte._persistence._db import LocalDB

try:
We should not crash if the database is not found.
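One way to get that behavior is to treat every database error as a cache miss. The sketch below uses plain `sqlite3` as a stand-in for `LocalDB`; the function name `lookup_bundle`, the `digest` column, and the table layout are assumptions for illustration, and only the `(hash_digest, remote_path)` return shape comes from the docstring above.

```python
import sqlite3
from typing import Optional, Tuple


def lookup_bundle(db_path: str, digest: str) -> Optional[Tuple[str, str]]:
    """Return (hash_digest, remote_path) for a file digest, or None on a miss or any DB error."""
    conn = None
    try:
        # Read-only URI mode: a missing file raises instead of being silently created.
        conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
        row = conn.execute(
            "SELECT hash_digest, remote_path FROM bundle_cache WHERE digest = ?",
            (digest,),
        ).fetchone()
    except sqlite3.Error:
        # Missing file, missing table, or a corrupt database all count as a miss.
        return None
    finally:
        if conn is not None:
            conn.close()
    return (row[0], row[1]) if row else None
```

The caller then takes the normal upload path whenever this returns None, so a broken or absent cache only costs time.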
print_ls_tree(from_dir, files)

# Check persistent cache before creating the tar bundle to avoid unnecessary work
if not dryrun:
Can you run stress/runs_per_second to ensure it does not break?
I have a cache in _run.py. Should we just add it there instead of here? It feels weird that this is using SQLite.
I guess we would then have to repeat it for deploy and serve.
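The control flow under discussion in this hunk (consult the cache before any tar work, and skip the cache in dry-run mode) can be sketched independently of where the cache ends up living. Everything here is hypothetical: a plain dict stands in for the persistent store, and `upload_bundle` and `create_and_upload` are illustrative names, not functions from this PR.

```python
from typing import Callable, Dict, Tuple

# (hash_digest, remote_path), matching the return shape described in the docstring.
BundleRef = Tuple[str, str]


def upload_bundle(
    file_digest: str,
    cache: Dict[str, BundleRef],
    create_and_upload: Callable[[], BundleRef],
    dryrun: bool = False,
) -> BundleRef:
    """Reuse a cached upload when the file digest matches; otherwise build and upload."""
    if not dryrun:
        hit = cache.get(file_digest)
        if hit is not None:
            # Cache hit: neither the tar bundle nor the upload is needed.
            return hit
    result = create_and_upload()
    if not dryrun:
        # Assumption: dry runs never populate the cache.
        cache[file_digest] = result
    return result
```

Putting the check in a shared helper like this is one way to avoid repeating it across run, deploy, and serve.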
Summary
- Consolidates the image and bundle caches into the single LocalDB database (~/.flyte/local-cache/cache.db).
- After the first successful remote image check (docker manifest inspect), subsequent runs hit the local SQLite cache (~0ms). The cache key is a SHA-256 of repository, tag, and architecture.
- When list_files_to_bundle returns the same digest as a previous run, the upload is skipped entirely and the cached remote path is reused. The cache check happens before create_bundle, so tar creation is also avoided on a cache hit.
- PersistentCacheImageChecker is inserted as the first checker in the DockerImageBuilder checker chain, before network-based checkers.

Key changes
- _persistence/_db.py: Added image_cache and bundle_cache table DDLs to LocalDB, with a unified _ALL_TABLE_DDLS list for DRY initialization across sync and async paths.
- image_builder.py: Removed standalone ~/.flyte/cache/images.db; reads/writes now go through LocalDB.get_sync() with _write_lock for thread safety.
- bundle.py: Removed standalone ~/.flyte/cache/bundles.db, following the same consolidation pattern; uses a lazy import of LocalDB to avoid circular imports.
- _task_cache.py: Added missing _write_lock to _set_sync and clear for consistency with all other sync write paths.

Test plan
- test_cached: verifies persistent cache is checked first and other checkers are skipped
- test_persistent_cache_write_and_read: verifies write/read round-trip and arch isolation via SQLite
- LOG_LEVEL=10 python examples/basics/hello.py: first run writes cache, second run shows "found in persistent cache" in ~0ms

🤖 Generated with Claude Code
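As an illustration of the checker ordering described in the summary, here is a minimal sketch in which the persistent cache is consulted first and a hit from any later checker is written back. The class and function names (`PersistentCacheChecker`, `FakeRegistryChecker`, `check_image`) are hypothetical and the store is an in-memory dict; the real `DockerImageBuilder` chain and `PersistentCacheImageChecker` may differ.

```python
from typing import Dict, Iterable, List, Optional


class PersistentCacheChecker:
    """Stand-in for the on-disk cache; a dict replaces the SQLite table here."""

    def __init__(self) -> None:
        self.store: Dict[str, str] = {}

    def image_exists(self, image_uri: str) -> Optional[str]:
        return self.store.get(image_uri)

    def remember(self, image_uri: str) -> None:
        self.store[image_uri] = image_uri


class FakeRegistryChecker:
    """Stand-in for a network-based checker; counts how often it is called."""

    def __init__(self, known: Iterable[str]) -> None:
        self.known = set(known)
        self.calls = 0

    def image_exists(self, image_uri: str) -> Optional[str]:
        self.calls += 1
        return image_uri if image_uri in self.known else None


def check_image(
    image_uri: str,
    cache: PersistentCacheChecker,
    network_checkers: List[FakeRegistryChecker],
) -> Optional[str]:
    """Try the persistent cache first, then each network checker in order."""
    hit = cache.image_exists(image_uri)
    if hit is not None:
        return hit
    for checker in network_checkers:
        found = checker.image_exists(image_uri)
        if found is not None:
            # Persist so later invocations can skip the network checks.
            cache.remember(image_uri)
            return found
    return None
```

In this sketch a second lookup for the same URI never reaches the registry checker, which mirrors the behavior test_cached is described as verifying.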